BY:

- **MOHAMMAD ALSHAYE** : 202065240

Discovery:

Our data is in the field of computer science and electronics.

We took the data from Kaggle.

Data Preparation

Understanding Data

  1. Identifying the fields of the data.
    From the above table, we can see the following: the number of rows is 4854 and the number of columns is 14. The fields are Unnamed, Product, Type, Release Date, Process Size (nm), TDP (W), Die Size (mm^2), Transistors (million), Freq (MHz), Foundry, Vendor, FP16 GFLOPS, FP32 GFLOPS, and FP64 GFLOPS.

  2. Identifying the type for each field based on value. Also, identifying the datatypes in Python. The following table gives the required information:

| Field | Type | Description |
| --- | --- | --- |
| Unnamed | Numeric | Index |
| Product | Categorical | Product |
| Type | Categorical | CPU or GPU |
| Release Date | Categorical | Release Date |
| Process Size (nm) | Numeric | Process Size in nanometers |
| TDP (W) | Numeric | Thermal Design Power in Watts |
| Die Size (mm^2) | Numeric | Die Size in squared millimeters |
| Transistors (million) | Numeric | Transistors in millions |
| Freq (MHz) | Numeric | Frequency in megahertz |
| Foundry | Categorical | The company that manufactured the chip |
| Vendor | Categorical | The company that designed the chip |
| FP16 GFLOPS | Numeric | Giga Floating-Point Operations per Second using floating-point (FP16) arithmetic |
| FP32 GFLOPS | Numeric | Giga Floating-Point Operations per Second using floating-point (FP32) arithmetic |
| FP64 GFLOPS | Numeric | Giga Floating-Point Operations per Second using floating-point (FP64) arithmetic |
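
The two steps above (counting rows and columns, then labelling each field as numeric or categorical) can be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's original code: the `sample` frame holds made-up rows that only borrow a few of the column names listed above, and `summarize_fields` is a hypothetical helper. On the real data, `sample` would be replaced by the DataFrame loaded from the Kaggle CSV.

```python
import pandas as pd

# Hypothetical sample rows: the values are invented for illustration only.
# Only the column names are taken from the field list above.
sample = pd.DataFrame({
    "Product": ["Chip A", "Chip B", "Chip C"],
    "Type": ["CPU", "GPU", "CPU"],
    "Release Date": ["2010-01-01", "2015-06-01", "2020-03-01"],
    "Process Size (nm)": [45.0, 28.0, 7.0],
    "TDP (W)": [95.0, 150.0, 65.0],
    "Vendor": ["Vendor X", "Vendor Y", "Vendor X"],
})


def summarize_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its pandas dtype and a coarse kind."""
    kinds = [
        "Numeric" if pd.api.types.is_numeric_dtype(df[col]) else "Categorical"
        for col in df.columns
    ]
    return pd.DataFrame({
        "Field": list(df.columns),
        "Dtype": [str(t) for t in df.dtypes],
        "Kind": kinds,
    })


# Step 1: number of rows and columns.
n_rows, n_cols = sample.shape
print(f"Number of rows is {n_rows}. Number of columns is {n_cols}.")

# Step 2: dtype and coarse type of each field.
summary = summarize_fields(sample)
print(summary.to_string(index=False))
```

Note that a date stored as text is reported as categorical by this check, which matches the classification of Release Date in the table; converting it with `pd.to_datetime` would be a separate, optional step.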

Finding Inconsistencies

Fixing Inconsistencies

Removing Outliers

Creating new column "Performance"

Model Planning:

A

B

Model Building:

A- Estimation & evaluation

Regression

Summary:

Classification

Summary:

Clustering

B - Comparison:

C - Interpretation :

Operationalize:

A:

B:

Communicate results:

A:

B: